Robust Topological Inference: Distance To a Measure and Kernel Distance

Authors

  • Frédéric Chazal
  • Brittany Terese Fasy
  • Fabrizio Lecci
  • Bertrand Michel
  • Alessandro Rinaldo
  • Larry A. Wasserman
Abstract

Let P be a distribution with support S. The salient features of S can be quantified with persistent homology, which summarizes topological features of the sublevel sets of the distance function (the distance of any point x to S). Given a sample from P we can infer the persistent homology using an empirical version of the distance function. However, the empirical distance function is highly non-robust to noise and outliers. Even one outlier is deadly. The distance-to-a-measure (DTM), introduced by Chazal et al. (2011), and the kernel distance, introduced by Phillips et al. (2014), are smooth functions that provide useful topological information but are robust to noise and outliers. Chazal et al. (2014) derived concentration bounds for DTM. Building on these results, we derive limiting distributions and confidence sets, and we propose a method for choosing tuning parameters.

1. Introduction. Figure 1 shows three complex point clouds, based on a model used for simulating cosmology data. Visually, the three samples look very similar. Below the data plots are the persistence diagrams, which are summaries of topological features defined in Section 2. The persistence diagrams make it clearer that the third data set is from a different data-generating process than the first two. This is an example of how topological features can summarize structure in point clouds. The field of topological data analysis (TDA) is concerned with defining such topological features; see Carlsson (2009). When performing TDA, it is important to use topological measures that are robust to noise. This paper explores some of these robust topological measures. Let P be a distribution with compact support S ⊂ R^d. One way to describe the shape of S is by
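
The robustness contrast described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper. It assumes the usual empirical form of the DTM from Chazal et al. (2011): for a sample of size n and a parameter m in (0, 1], the squared DTM at x is the mean squared distance from x to its k = ceil(m n) nearest sample points. The function names, the choice m = 0.1, and the circle-plus-one-outlier data are illustrative assumptions.

```python
import numpy as np


def empirical_distance(x, sample):
    """Distance from x to the nearest sample point (empirical distance function)."""
    return float(np.min(np.linalg.norm(sample - x, axis=1)))


def empirical_dtm(x, sample, m=0.1):
    """Empirical DTM (assumed form): root of the mean squared distance
    from x to its k = ceil(m * n) nearest sample points."""
    n = len(sample)
    k = max(1, int(np.ceil(m * n)))
    sq_dists = np.sum((sample - x) ** 2, axis=1)
    # np.partition places the k smallest values in the first k slots.
    k_smallest = np.partition(sq_dists, k - 1)[:k]
    return float(np.sqrt(k_smallest.mean()))


# Hypothetical data: 200 points on the unit circle, then one outlier at the center.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
circle = np.column_stack([np.cos(angles), np.sin(angles)])
with_outlier = np.vstack([circle, [[0.0, 0.0]]])

center = np.array([0.0, 0.0])

# The empirical distance at the center collapses from about 1 to 0.
d_clean = empirical_distance(center, circle)
d_outlier = empirical_distance(center, with_outlier)

# The DTM at the center barely moves, because the outlier is only
# one of the k nearest points that are averaged.
dtm_clean = empirical_dtm(center, circle)
dtm_outlier = empirical_dtm(center, with_outlier)

print(d_clean, d_outlier, dtm_clean, dtm_outlier)
```

In this toy setting a single added point fills the hole of the circle for the empirical distance function, while the DTM at the center stays close to 1. Larger values of m average over more neighbors and smooth more aggressively; choosing this tuning parameter is one of the questions the paper addresses.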


Similar resources

A robust least squares fuzzy regression model based on kernel function

In this paper, a new approach is presented to fit a robust fuzzy regression model based on some fuzzy quantities. In this approach, we first introduce a new distance between two fuzzy numbers using the kernel function, and then, based on the least squares method, the parameters of the fuzzy regression model are estimated. The proposed approach has a suitable performance to...


Semi-supervised composite kernel learning using distance metric learning techniques

The distance metric plays a key role in many machine learning and computer vision algorithms, so choosing an appropriate distance metric has a direct effect on the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...


Robust Potato Color Image Segmentation using Adaptive Fuzzy Inference System

Potato image segmentation is an important part of image-based potato defect detection. This paper presents a robust potato color image segmentation through a combination of a fuzzy rule based system, an image thresholding based on Genetic Algorithm (GA) optimization and morphological operators. The proposed potato color image segmentation is robust against variation of background, distance and ...


Geometric Inference on Kernel Density Estimates

We show that geometric inference of a point cloud can be calculated by examining its kernel density estimate with a Gaussian kernel. This allows one to consider kernel density estimates, which are robust to spatial noise, subsampling, and approximate computation in comparison to raw point sets. This is achieved by examining the sublevel sets of the kernel distance, which isomorphically map to s...


Rates of convergence for robust geometric inference

Distances to compact sets are widely used in the field of Topological Data Analysis for inferring geometric and topological features from point clouds. In this context, the distance to a probability measure (DTM) has been introduced by Chazal et al. (2011b) as a robust alternative to the distance to a compact set. In practice, the DTM can be estimated by its empirical counterpart, that is the dist...



Journal:
  • CoRR

Volume abs/1412.7197

Publication date: 2014